Hi, welcome to this second lesson.
By the end of this lesson,
The plan of this lesson is the following:
To select the dataset that we want to explore, let’s first go to the
Data - Datasets page.
There, you can see the list of all public datasets, and you can click on the name of the dataset you want to explore.
For this lesson, we will use the example of the foph_hosp_d dataset. This dataset represents the daily hospitalisations of COVID-19 in Switzerland according to the FOPH (Federal Office of Public Health).
NB: We can collapse the left panel to focus on the analysis (middle panel) and visualisation of the data (right panel), selecting the arrow icon highlighted below:

The middle panel of the page has three different subsections:
Chart type, Time, and Query.

By default, the Visualisation Type is a
Table. To see the different ways you can display your
data and potentially change the chart type, click on the current
Visualisation Type (Table).
You can see that a large number of other visualisations are available, such as Time-series, Box Plots, Maps, Bar Charts, Line Charts.
To select a visualisation, you can either explore All charts available, the most Popular, or by Category.
For instance, for Distribution charts, you may choose between Histograms and Box Plots; among the Evolution charts category, you can choose between Line Charts, and different types of Time-series; the Maps category include Country Maps and World Maps.
In the Time section, three key fields are defined:
TIME COLUMN to be used for visualisation,TIME RANGE of interest, e.g., no
filter, last quarter, previous
calendar year, or more custom and
advanced time ranges,TIME GRAIN (granularity) to be used for instance in
time evolution charts, e.g., daily, weekly, monthly.By default we will now explore all (i.e. unfiltered by time) daily data, based on the date column in our table.
Let’s now have a closer look at the Query section. This
section presents two different ways of exploring the data (QUERY MODE)
represented by the AGGREGATE and RAW
RECORDS options.

AGGREGATE query mode allows the user to simplify
the information of multiple rows of a dataset using some criteria as the
sum of the values or average (there are other options). This aggregation
is made according to some columns in the dataset.RAW RECORDS option, you will just work with rows
of the dataset without applying any aggregation operation.In this section, we will first see an example with the
RAW RECORD mode, as a first exploration step of our
dataset. When this option is selected you will visualise the fields
below.
date,
geoRegion, entries and sumTotal
columns. To select a column, click on the COLUMNS
field, a drop-down list will appear, with all columns you can select
from.
Go through this list, and select your columns of interest (here
date, geoRegion, entries and
sumTotal). The __COLUMNS_ field should look as follows:
After selecting these columns in the COLUMNS field,
you will notice that a button labeled RUN QUERY will appear
on the the right panel of the Explore page.
Under the table chart, Data resulting from your
query also appear, with the number of data rows retrieved (here
10k rows retrieved).
Once you get the table, if you don’t want to make any new explorations, you can export this table as a .JSON or .CSV file by pressing the button in the right upper corner.
But, we can still explore many other features, to get an insight on the COVID hospitalisations in Switzerland.
Right below the COLUMNS field, there is a field called ORDERING. In this field, you can order the dataset in ascending or decreasing order according to some column values.
For example, let’s order the dataset by the date
column, in ascending order; click on ORDERING field and
select data [asc].
We can add a second ordering level. For example, for each date, we
can further order the rows by number of entries, in
descending order; click again on the ORDERING field,
and select entries [desc]. You should then get this
result:
Let’s
explore some of these filters.
+ Add filter the following window will
show up on your screen:

column that you
want to apply the filter. This option can be filled with any column of
the dataset.operator you
want to apply to the data.value of reference for your
filter.We have the following operators available:
equals: You will use this operator to select
only the rows of a dataset with a specific value. The operator equals
can be applied for all types of columns. For example, in the case of the
foph_hosp_d dataset, you can use this filter if you just
want to get the values in which
geoRegion equals GE.
not equals: We can use this operator to avoid
some value. For example, if you don’t want to get the rows with entries
equal a 0 you can apply the following filter:
entries not equals 0.
< : This operator only can be applied in numeric columns. You should use it when you want the values returned in a column to be lower than some threshold defined in the filter value.
> : This operator only can be applied in numeric columns. You should use it when you want the values returned in a column to be bigger than some threshold defined in the filter value.
< = : This operator only can be applied in numeric columns. You should use them when you want the values returned in a column to be lower or equal to a threshold defined in the filter value.
> = : This operator only can be applied in numeric columns. You should use them when you want the values returned in a column to be bigger or equal to a threshold defined in the filter value.
IN : This operator can be used to select values
in a column equal to some group of values. For example, in our example
dataset, you could want to select the rows where the
geoRegion in (GE, FR). Note that you can choose as many
values as you want.
NOT IN: This operator makes the inverse of the
IN operator. This filter is used when you want the table
returned to exclude the values added in the filter value.
LIKE (case insensitive) and LIKE : This operator can be used to select values in columns with the text values following some pattern. The difference between the LIKE (case insensitive) and the LIKE is that the first operator (insensitive) doesn’t differentiate between the upper and lower cases.
IS NOT NULL : This operator is used to select
only rows with non-null values according to some
column.
IS NULL : This operator is used to select only
rows with null values according to some column.
foph_hosp_d dataset, let’s
use this filter to get the rows in which
geoRegion equals GE, as follows:
After configuring the filter you need to press the SAVE
button and the RUN QUERY button again. You can see the
result of applying this filter below:
In our platform, we can apply multiple filters at the same time.
For example, in the dataset foph_hosp_d, the column
sumtotal represents the cumulative number of
hospitalisations. Let’ filter the data to keep only rows where the
cumulative number of hospitalisations exceeds 100.
To do so, click on the + icon on the right of
FILTERS field, and add the filter sumtotal >= 100), as
shown below:
And that’s the result:
When you scroll down the table chart, you can already see clearly the first COVID wave, which started end of March 2020, and the more impressive next waves, i.e. around October-November 2020, and beginning of 2022.
We can give a title to the chart, for example
Daily COVID Hospitalisations in Geneva. To do so, click on
- untitled and repalce it with your new title, as
follows:
Now, you can SAVE this chart for later presentations
and dashboard development by clicking on +SAVE button on
top of the middle panel:
This will open a Save chart pop-up window, where the
CHART NAME field will be pre-filled with the title you
gave to your chart:
You can also see that there is an ADD TO DASHBOARD field, which allows you to add the chart you are saving to a dashboard. You can add in this field the name of a dashboard, whether this dashboard already exists or not. If it does not exist, it will create it before adding the current chart to it.
Let’s do that and name our first dashboard
COVID Hospitalisations in Geneva and Switzerland. You can
now click on SAVE.
CONGRATULATIONS, you have just created your first chart and dashboard :D
Let’s go quickly to the Charts menu to check that our table chart has been saved properly.
We can see here that the chart appears as expected in the list of
saved charts, with the title we just gave it, type of chart “table”,
dataset switzerland.foph_hosp_d, and that it was just
created and modified by myself logged as a “Aziza Merzouki”.
Same for the dashboard, let’s go to the Dashboards menu to check that our “COVID Hospitalisations in Geneva and Switzerland” has been created and saved properly.
Notice that the status of our dashboard is “Draft”. At the end of this lesson, we will see how we can publish it :)
If you click on your dashboard, you should see the following:
Now, suppose we are interested in following the evolution of weekly hospitalisations in Switzerland over the last 2 years, and would like to create a new chart for that.
Let’s go back the Charts page, and click on the
+CHART button in the upper right corner of the page to
create a new chart.
The page named Create a new chart opens. You must do two things here:
In the field CHOOSE A DATASET, type or select from the drop-down list
again the foph_hosp_d dataset.
In the CHOOSE CHART section, different sub-sections are available: Recommended tags, Category, and Tags.
In the Category sub-section, select the Evolution category. A number of corresponding charts will appear on the right panel. Let’s click on the first one and select the: Line Chart. It is the classic chart that visualised how metrics change over time.
We can then click on CREATE NEW CHART button in the
bottom.
When you do so, a new Chart page opens, with pre-filled
Datasetand Chart type fields.
You can see as well that for this type of chart, some parameters are mandatory. It is the case here for the METRICS field in the Query section. That is why it is colored in red, and annotated with an exclamation mark (!). Indeed, for line charts, METRICS field cannot be empty and one or many several metrics must be selected to be displayed.
As at least one METRIC is mandatory, let’s click on
+ Add metric in the METRICS field. A pop-up window appears.
If we select the
SIMPLE menu, we are requested to fill the COLUMN to be displayed and how we would like to AGGREGATE it. As we have daily new hospitalisations data in our dataset and we are interested visualising weekly hospitalisations, we will set COLUMN toentries, and AGGREGATE toSUM`,
since the number of hospitalisations (entries) per week is the SUM of
the daily hospitalisation during that week :)
Now, click SAVE in this pop-up window, then click
RUN QUERYon the right panel of the Chart page. The result
should be the following:
By default the TIME GRAIN field in the Time Section
is set to Day. As we are interested by numbers of weekly
hospitalisations, let’s change it to Week, for aggregating
entries at the weekly-level, and RUN QUERYagain. You will
get the result below, where you can directly see how it made the line
smoother.
Currently all entries in our table (independently from
geoRegion value) are summed by week. Let’s
GROUP entries BY
geoRegion, to see the evolution of the number of entries by
geoRegion (cantons).
In GROUP BY field, add geoRegion
column, and press on RUN QUERY.
The results should look like the following:
Great, we can see now that data are separated by canton! But, we also see that we have data for the whole Switzerland (geoRegion = CH) and for Switzerland and Liechtenstein (geoRegion = CHFL) all together.
In order to focus on canton data only, we must FILTER out
CH and CHFL data. In FILTERS, click on
+Add filter, and set the filter to
geoRegion NOT IN (CH, CHFL).
geoRegion, you can still add a legend to
see the correspondence between the line colors and the cantons. For
that, go to the CUSTOMIZE tab (next to DATA tab), and
tick the LEGEND box.NB: Note that the plot is interactive; you can show or hide lines from the chart by clicking or double-clicking on their associated legend item!
Time in the X AXIS LABEL field, and
Number of hospitalisations in the Y AXIS
LABEL field of the Y AXIS section.And that’s the result, where we can see the different waves during the last 2 years, by Swiss canton.
Finally, as previously done with Table Chart, let’s:
SAVE & GO TO DASHBOARD.The dashboard will open and you will see that your two charts (1) table and 2) line chart) were added, as below:
You can easily edit the format of the dashboard. To do so click on the Edit button on the top right corner of the page.
It will open the Edit Dashboard page, below:
There, you can put the charts side by side (with drag and drop) and resize them to take the whole width of the page. When done you can, SAVE it, and get the following NICE result :)
Now, let’s create a last chart.You now already know how to create a
new chart, right? :) 1. Go to Charts page, and 2.
Clicking on +CHART button.
foph_hosp_d.
2. In the CHOOSE CHART section, go to the
Map Category, and select the
Country Map in the right panel. 3. Then, click on
CREATE NEW CHART button
In the Query section of the Chart Page, fill :
Switzerland,isocode, andSUM(entries), as
we did in the previous line chart.+ symbol next
to METRIC, select the SIMPLE panel, and set
COLUMN to entries and
AGGREGATE to SUM.It should look like the following:
In this dataset, the isocode column allows us to link
the canton data to the corresponding canton polygon on the country map.
This isocode is a concatenation of the
string ‘CH-’ with the value of
geoRegion (‘CH’+geoRegion). For instance, canton
Geneva, with geoRegion = ‘GE’, has and isocode = ‘CH-GE’.
As we are interested here by canton data only, let’s again FILTER out
data where geoRegion = ‘CH’ or
geoRegion = ‘CHFL’. To do that, we will add a filter
geoRegion NOT IN (‘CH’,’CHFL’), and RUN QUERY
again, as follows:
If you are not fan of the color scheme, you can change it, by going
to Chart Options section (under the
Query section), selecting for instance
oranges in the LINEAR COLOR SCHEME field,
and RUN QUERY again, to get the following result:
It does look quite nice :) What do you think?
As we did not edit the time-related elements yet, you have now on your map the total number of hospitalisation entries per canton, since February 2020, while we are interested by last week’s hospitalisations only.
To correct that, we should go to Time section in the
middle panel, and set TIME RANGE to
last week. For that, click on No filter value
in the TIME RANGE field; this will open a pop-up window
named Edit time range.
There, select:
Last in RANGE TYPE,last week in Configure Time
Range, andAPPLY button in the bottom right of the
window. The window will close and you will be able to
RUN QUERY again.Below is the resulting map. Notice that the map is dynamic, when you pass your mouse over the cantons on the map, the name of the canton and the number of hospitalisation entries in this canton appear.
TO DO : ADD MAP (dataset not updated)
We can now :
+SAVE button in the middle
panel,SAVE & GO TO DASHBOARD.TO DO: ADD SCREAN-SHOT
When on the Dashboard page, Edit the format of the dashboard to: 1. Move the map we just added under the first two charts, and 2. Resize it to take the full width of the page.
If you have followed all the instructions above, you should get this result:
TO DO: ADD MAP
Further documentation is available on: https://superset.apache.org/docs/creating-charts-dashboards/creating-your-first-dashboard/
TO DO: GET THE RIGHT TO PUBLISH DASHBOARD
Congratulations! You reached the end of this lesson, where you have explored your first dataset, and built your first charts and dashboard :)